Abstract: MapReduce is a programming model or software framework which is associated with the implementation of generating large data sets and their processing to a broad variety of real world task. Programmers computes in terms of a map and a reduce function. There are various programs written in the style that automatically functions parallel and are executed on large clusters of commodity computers. MapReduce jobs are executed on commodity computers every day, processing a total in petabytes of data per day.

Keywords: Big data, Data science, Map Reduce and Distributed Systems.